Severe acute respiratory syndrome coronavirus (SARS-CoV) main protease (M pro), a protein required for
the maturation of SARS-CoV, is vital for its life cycle, making it an attractive target for structure-based
design of anti-SARS drugs. Structure-based virtual screening of a chemical database containing
58 855 compounds, followed by testing of the selected compounds for SARS-CoV M pro inhibition, led to
two hit compounds. The core structures of these two hits, defined by the docking study, were used for a
further analogue search. Twenty-one analogues derived from these two hits exhibited IC 50 values below 50 μM,
with the most potent one showing 0.3 μM. Furthermore, the complex structures of two potent inhibitors
with SARS-CoV M pro were solved by X-ray crystallography. They bind to the protein in a distinct manner
compared to all published SARS-CoV M pro complex structures. They inhibit SARS-CoV M pro activity via
an intensive H-bond network and hydrophobic interactions, without the formation of a covalent bond.
Interestingly, the most potent inhibitor induces protein conformational changes, and the inhibition mechanisms,
particularly the disruption of the catalytic dyad (His41 and Cys145), are elaborated.


Introduction 

Severe acute respiratory syndrome (SARS), a new respiratory
disease caused by a novel coronavirus, SARS coronavirus
(SARS-CoV), 1,2 spread rapidly all over the world in 2003 and
infected more than 8000 people, resulting in approximately 800
deaths worldwide, with mortality rates reaching over 40% in
certain populations. 3,4 The development of drugs and vaccines is
being vigorously pursued, but these are still quite far from
the clinic.

SARS-CoV, an enveloped positive-strand RNA virus from
the Coronaviridae family, 5 codes for two very large polyproteins,
namely, pp1a (~450 kDa) and pp1b (~750 kDa), that mediate
all the functions required for viral replication and transcription.
To be functional, these polyproteins need to be processed by
the 33.8 kDa main protease (M pro), also called the 3C-like
protease (3CL pro). 6 Because of its important role in SARS-CoV
maturation and infection, M pro has been suggested as a promising
target for anti-SARS agent design.

The crystal structures of SARS-CoV M pro have been solved
recently, 7-11 revealing that it forms a homodimer with three
domains in each monomer. The antiparallel β-barrel structure
of domains I and II is similar to that of other coronavirus proteases
and forms a chymotrypsin-like fold responsible for catalytic
reactions. The catalytic dyad residues His41 and Cys145 are
located at the cleft between domains I and II. The third domain,
the C-terminal α-helical domain, is very diverse among the
picornavirus and coronavirus M pro. It has been reported that domain
III exists as a stable dimer even at a very low concentration,
indicating that this extra domain contributes to the dimerization
of SARS-CoV M pro and therefore switches the enzyme from
the inactive form (monomer) to the active form (dimer). 12
Additionally, the N finger (residues N1-N7) located in the same
area also contributes to the dimerization of the two monomers.
The availability of protein structures and the biological
characteristics of SARS-CoV M pro provide insights into the substrate
binding site, making it an attractive target for structure-based
drug design in an effort to discover more potent and specific
inhibitors against it.

Inhibitors of SARS-CoV M pro have been identified by various
computational methods. 13-19 For example, Liu et al. 14 and
Dooley et al. 15 identified inhibitors using a 3D structure
derived from molecular dynamics simulation of SARS-CoV M pro
as the virtual screening target structure, while others used a
pharmacophore model to predict potential inhibitors. 20,21 These
computer-aided drug design efforts yielded only a few compounds
whose SARS-CoV M pro inhibition in the micromolar range was
confirmed by bioassay. These results indicate that there is
still a gap to be filled in finding more potent
inhibitors of SARS-CoV M pro.

Moreover, although a number of nonpeptide inhibitors of
SARS-CoV M pro have been discovered, such as bifunctional
arylboronic acids, 22 isatin derivatives, 23 polyphenols, 24 etacrynic
acid analogues, 25 cinanserin, 26 and other chemically diverse
small molecules, 15,27,28 the lack of structural biology information
on these compounds and their interactions with SARS-CoV M pro
makes inhibitor design more difficult. All the structures published
to date are complexes with peptidyl inhibitors
covalently bonded to SARS-CoV M pro. 7,8,10,11 Therefore,
there is an urgent need to obtain molecular insight into the binding
of small-molecule compounds to SARS-CoV M pro in order to design
more potent and specific drugs against it.

In this study, we performed structure-based virtual screening
of a chemical database containing 58 855 compounds based
on the 3D structure of SARS-CoV M pro. Active compounds,
selected by the virtual screening approach and confirmed by the
bioassay, were taken as templates to build the core structures
for the analogue search. The selected 42 analogues were again
evaluated in a SARS-CoV M pro inhibition assay. Of these
analogues, 21 compounds showed inhibition activity against
SARS-CoV M pro with IC 50 values less than 50 μM, with the
most potent one showing 0.3 μM. Finally, the complex structures
of potent inhibitors with SARS-CoV M pro were solved by X-ray
crystallography to further study the SARS-CoV M pro inhibition
mechanisms of these compounds.

Figure 1. SARS-CoV M pro binding site.

Results and Discussion 

Identification of Novel SARS-CoV M pro Inhibitors by
Structure-Based Virtual Screening. The structure of SARS-CoV
M pro in complex with CMK, a substrate analogue (PDB
ID 1UK4), 7 was used as the target to perform virtual screening
on the Maybridge database containing 58 855 small molecules.
The binding site includes the catalytic center (His41 and Cys145)
and several subsites, designated as S1 (His163, Glu166, Cys145,
Gly143, His172, and Phe140), S2 (Cys145, His41, and Thr25),
S3 (Met165, Met49, and His41), S4 (Met165 and Glu166), and
S5 (Gln189, Met165, and Glu166) (Figure 1). The program
GOLD v2.1 (CCDC Software Limited, Cambridge, U.K.) was
used to perform virtual screening. The docked molecules were
first ranked by the fitness score of the GOLDScore function to select
the best pose from the 20 poses generated by GOLD, and the
compounds were then re-sorted by the external hydrogen-bond energy
term implemented in GOLDScore to rank the binding affinity. As
the GOLDScore scoring function has been optimized for the
prediction of ligand binding positions, as suggested by the user
manual, it is reasonable to employ GOLDScore to predict the
binding pose of the compounds. The best pose of each
compound selected by GOLDScore was therefore retained for
further analysis. Since H-bonding interactions are
important for ligand binding, as revealed by the
protease-substrate complex structure, the best conformer of each
compound was then further ranked by its H-bonding interactions
with the protease. The top 50 compounds ranked by the external
hydrogen-bond energy term, a subcomponent of GOLDScore,
were then purchased and experimentally evaluated for their
ability to inhibit SARS-CoV M pro. Of these, two compounds
were found to inhibit SARS-CoV M pro by more than 50% at 10
μM (Figure 2). These two compounds, compound 1
[6-methoxy-3-nitro-2-(phenylsulfonyl)pyridine] and compound 2
(2-({[3-(4-chlorophenyl)-1,2,4-oxadiazol-5-yl]methyl}thio)-4,5-dihydro-1H-imidazol-3-ium
chloride) (Figure 2), were then subjected to the second round
of virtual screening.
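The two-stage ranking described above can be sketched as follows. This is a minimal illustration, not the authors' script: the `Pose` record, its field names, and the toy scores are hypothetical, and it assumes that both the total fitness and the external H-bond term are exported per pose with higher values being better.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    """One docked pose (hypothetical record; field names are assumptions)."""
    compound_id: str
    fitness: float    # total fitness score of the pose (assumed: higher is better)
    ext_hbond: float  # external H-bond term of the pose (assumed: higher is better)


def best_pose_per_compound(poses):
    """Stage 1: keep, for each compound, the pose with the highest total fitness."""
    best = {}
    for pose in poses:
        current = best.get(pose.compound_id)
        if current is None or pose.fitness > current.fitness:
            best[pose.compound_id] = pose
    return best


def select_top_by_hbond(poses, n_top=50):
    """Stage 2: rank the retained poses by the external H-bond term, keep n_top."""
    retained = best_pose_per_compound(poses).values()
    ranked = sorted(retained, key=lambda p: p.ext_hbond, reverse=True)
    return ranked[:n_top]


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    demo = [
        Pose("A", 60.0, 2.0),
        Pose("A", 55.0, 9.0),  # better H-bond term, but not the best pose of A
        Pose("B", 40.0, 5.0),
        Pose("C", 45.0, 1.0),
    ]
    for pose in select_top_by_hbond(demo, n_top=2):
        print(pose.compound_id, pose.fitness, pose.ext_hbond)
```

Note that the H-bond term used in stage 2 is the one belonging to the best-fitness pose, not the best H-bond term over all poses of a compound, which matches the order of operations given in the text.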



Figure 2. Structures of the hit compounds. Compound 1: 58% inhibition at 10 μM.
Compound 2: 61% inhibition at 10 μM.


This is the first report of using a single hydrogen-bond energy
term to rank and select compounds. This method could be applied
to other proteins with H-bond-rich active sites or implemented
at different stages of virtual screening to predict the H-bonding
interactions with the protein.

Identification of Core Structure and Analogue Search:
(a) Docking Study of Compound 1. Compound 1 (Figure 2)
inhibited SARS-CoV M pro activity by 58% at 10 μM and was
docked into the active site of the protease in the second run of
virtual screening. The docking model (Figure 3) proposed that
the benzene ring of compound 1 made strong hydrophobic
interactions with the catalytic dyad residues Cys145 and His41.
The nitro substituent on the pyridine ring formed three
H-bonds with His163, Cys145, and Ser144. In addition, the
sulfone group was hydrogen-bonded to Ser144 and Gly143.
As revealed in the docking model, the two rings (benzene and
pyridine) together with the sulfone moiety made important
interactions with the protein and were therefore identified as
the scaffold for a further analogue search. Several criteria were
applied in the analogue search (Figure 3). The two rings could
be individually replaced by a six-membered aryl or heteroaryl ring.
The sulfone group, which functioned as a linker and made
interactions with the protein, was retained in the core structure.
To increase the analogue diversity, the substituents on the rings
were not limited. A total of 151 compounds that fulfilled the
above criteria were selected from the Maybridge database and
were then filtered by molecular mass (<1000 Da) and
structural diversity to remove large and
redundant compounds, leaving a total of 28 compounds. These 28
compounds were then redocked to SARS-CoV M pro to exclude
the compounds without important interactions with the protein.
His163 and Glu166 are highly conserved residues among
coronavirus main proteases, and the specific hydrogen-bond
interactions between P1-Gln and these two residues result in
the specificity for Gln at the P1 site. In view of the
importance and specific characteristics of the S1 site residues
(Glu166 and His163) and the catalytic dyad (Cys145 and His41),
the compounds without any interactions with Glu166, His163,
Cys145, and His41 were withdrawn. Finally, 23 compounds
were selected, and the 21 that were commercially available were
purchased. Their ability to inhibit SARS-CoV M pro was
evaluated in a bioassay. Of the 21 compounds, 12 showed
IC 50 values less than 50 μM (Table 1), and compound 3 (Table
1) exhibited the most potent inhibition, with an IC 50 of 0.3 μM.
It displayed a significantly improved potency over the initial
hit, compound 1, which makes it attractive as a possible
drug lead. Therefore, compound 3 was subjected to further
characterization by structural biology studies.
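The mass and interaction filters described above can be sketched as a small funnel. This is an illustrative sketch under stated assumptions: the candidate tuples and contact lists are hypothetical inputs, the phrase "without any interactions" is read as "contacts none of the four key residues", and the structural-diversity filter is left out because the text does not specify the similarity measure used.

```python
# Key residues named in the text: S1 site residues and the catalytic dyad.
KEY_RESIDUES = frozenset({"Glu166", "His163", "Cys145", "His41"})
MAX_MASS_DA = 1000.0  # molecular mass cutoff stated in the text (<1000 Da)


def passes_mass_filter(mass_da, max_mass=MAX_MASS_DA):
    """Keep compounds strictly below the molecular mass cutoff."""
    return mass_da < max_mass


def passes_interaction_filter(contact_residues, key_residues=KEY_RESIDUES):
    """Keep compounds whose docked pose contacts at least one key residue.

    Assumption: a compound is withdrawn only if it contacts none of them.
    """
    return bool(key_residues & set(contact_residues))


def filter_analogues(candidates):
    """candidates: iterable of (name, mass_da, contact_residues) tuples."""
    kept = []
    for name, mass_da, contacts in candidates:
        if passes_mass_filter(mass_da) and passes_interaction_filter(contacts):
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Hypothetical analogues, invented for illustration only.
    demo = [
        ("analogue-1", 412.3, ["Cys145", "Ser144"]),
        ("analogue-2", 1103.9, ["His41"]),            # too large
        ("analogue-3", 389.0, ["Gln189", "Pro168"]),  # no key-residue contact
    ]
    print(filter_analogues(demo))
```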

(b) Docking Study of Compound 2. The other hit compound,
2 (Figure 2), inhibited SARS-CoV M pro activity by 61% at 10
μM. The same strategy as described for compound 1 was
employed to identify the core structure and search for analogues
of compound 2. The predicted model (Figure 3) showed that
the dihydroimidazole ring of 2 fitted into the S2 hydrophobic
pocket and the oxadiazole ring was close to the S1 pocket.
Moreover, the chlorobenzene occupied the S4 and S5 subsites,
which are solvent-accessible. The dihydroimidazole ring formed
an H-bond with Cys145 and made close contacts with Gly143.
The oxadiazole group formed hydrophobic interactions with
Glu166 in the S1 site, while the chlorobenzene group made close
contacts with Gln189 and Pro168. As the dihydroimidazole and
oxadiazole interacted with the key residues Cys145 and
Glu166, these two heterocyclic rings together with the linker
were identified as the scaffold for the further analogue search.
Several criteria were applied in the analogue search (Figure 3).
The dihydroimidazole and oxadiazole rings could be individually
replaced by a five-membered aryl or heteroaryl ring. The linker
between dihydroimidazole and oxadiazole could be replaced by
other linkers with a length equal to three C-C bonds to retain
the relative position of the two rings. Because the chlorobenzene
occupies the less specific S4 and S5 sites, and to increase the
analogue diversity, this part was kept flexible. A total of 223
compounds that fulfilled the above criteria were selected from
the Maybridge database. The same filters and docking study as
described for compound 1 were carried out to exclude
compounds with large molecular weight or lacking interactions
with the important residues. Twenty-one compounds were finally
purchased for the bioassay. Nine of the 21 compounds showed
significant inhibition activity against SARS-CoV M pro, with IC 50
values less than 50 μM (Table 1). The analogues within this
family were diverse. Of the nine active compounds, compound
15 (Table 1) showed potent inhibition, with an IC 50 of 3 μM,
and was further studied by X-ray crystallography.

Figure 3. Identification of core structure and analogue search. The core structures of the two initial
hits, compounds 1 and 2, are defined by docking studies and used for the analogue search. Two filters,
molecular weight and structural diversity, are applied after the analogue search. Finally, a second
round of docking is applied to exclude the compounds without any interaction with the important
residues, namely, Glu166, His163, Cys145, and His41.

A flowchart representing various stages of the structure-based 
virtual screening, including the docking study and subsequent 
analogue search, is shown in Figure 4. 

Overall Structure of SARS-CoV M pro. Structural biology
studies were carried out to elucidate the interactions of the potent
inhibitors with SARS-CoV M pro. The structures of the native
protein, SARS-CoV M pro/3, and SARS-CoV M pro/15 were solved to
resolutions of 2.17, 1.86, and 1.97 Å, respectively (Table 2). The
asymmetric unit contained only one monomer. The electron density
maps of all residues of SARS-CoV M pro (residues 1-306) are
clear except for the region of residues 45-48, which is flexible
in all published structures.

The overall structure of SARS-CoV M pro is very
similar to the published structures except for residues 45-48
and residue Asn142. The loop of residues 45-48
is located at the entrance of the active site, and its flexibility
could allow the access of a ligand to the binding site of
SARS-CoV M pro. In contrast to the dramatic change in Asn142
upon ligand binding described by Yang et al., 8 Asn142 retains
the same conformation in our native and complex SARS-CoV M pro
structures. The conformation of Asn142 in our structures is the
same as in the ligand-bound form described by Yang et al.

Structure of SARS-CoV M pro in Complex with Compound
3. As revealed in the crystal structure (Figure 5), 3 adopts a
distinct binding mode compared to all the published
structures. 7,8,10,11 It occupies the S3-S5 pockets of SARS-CoV M pro.
The 2,4-dichloro-5-methylbenzene group inserts deep into the
hydrophobic pocket consisting of residues Pro39, His41,
Cys145, His163, His164, Phe181, Tyr182, and Phe185. The
phenyl ring makes strong π-π interactions with the side chain
of His41, while the substituents, the dichloro and methyl groups,
have close contacts with Cys145, His164, Pro39, and Leu27.
Moreover, the 1,3-dinitro-5-(trifluoromethyl)benzene group
forms intensive H-bonding interactions with the protein. One
of the nitro groups forms a direct H-bond with the nitrogen on
the side chain of His41 and two indirect H-bonds with Met49
and His41 via water molecule W75. The trifluoromethyl
substituent forms a weak H-bond with Gln192 and has close
contacts with Gln192, Gln189, Leu167, and Met165. In addition,
the benzene group forms hydrophobic interactions with Met165.
Moreover, the sulfone group makes H-bonding interactions with
water molecule W261.

Upon the binding of 3 to the protein, the side chain of
His41, which constitutes the catalytic dyad together with
Cys145 of SARS-CoV M pro, undergoes a dramatic conformational
change. In SARS-CoV M pro, the imidazole group of
His41 acts as a proton acceptor, enabling Sγ of Cys145 to act as a
nucleophile. The distance between His41 NE2 and Cys145 Sγ
is 3.69 Å in the native structure. Upon the binding of compound
3, His41 moves away from Cys145 to a distance of 9.07 Å to
accommodate the 2,4-dichloro-5-methylbenzene group of 3, which is
sandwiched between Cys145 and His41. The movement of
His41 completely blocks the catalytic dyad function and results
in the inhibition of SARS-CoV M pro activity, which could
provide the structural basis for the inhibition of the protease
by 3. The shift of His41 consequently results in the
movement of its adjacent residue, Met49. In addition, Met165
also moves away to accommodate the nitro group on 3.

Table 1. Chemical Structures of SARS-CoV M pro Inhibitors with IC 50

a Data are shown as mean ± SEM (n = 3).

The conformational change of the catalytic dyad is also seen
in the inhibitor-bound structure of caspase 1, 29 where the side
chain of His237 is rotated from a +gauche to a trans conformation,
creating a large hydrophobic pocket next to the P1 site.
The benzene ring of the inhibitor forms strong π-π interactions
with the side chain of His237, leading to the inhibition of
the protease.

Analysis of the SARS-CoV M pro/3 structure reveals that one
of the nitro groups is close to the side-chain imidazole group
of His41. The distance between the oxygen atom of the nitro group
and NE2 of His41 is about 2.87 Å. The nitro and the histidine
imidazole groups are both charge-bearing functional groups, as
the nitro group carries a negative charge and the histidine imidazole
group carries a positive charge. The electrostatic interactions
between the nitro and the histidine imidazole groups are likely
the major force responsible for triggering the dramatic
conformational change of His41.

Structure of SARS-CoV M pro in Complex with Compound
15. The chemical structure of compound 15 can be subdivided
into three groups for discussion of its interactions with the
protein (Figure 6). The first group is the triazole group that
inserts deep into the S2 pocket, making hydrophobic contacts
with Cys145 and Asn142 and H-bonding interactions with the
side chains of Cys145 and Asn142. The trifluoromethyl
substituent on the triazole group makes close contacts with the
catalytic dyad residues, Cys145 and His41, and the S1 pocket residues,
Gly143 and Ser144. The second group is the furan group, which
forms hydrophobic interactions with Glu166 and an indirect
hydrogen bond via water molecule W16 with the main chain
of Glu166. The third group, benzene, extends to the S4 and S5
pockets of SARS-CoV M pro and makes extensive hydrophobic
interactions with the surrounding residues, including Met165,
Glu166, Gln189, Gln192, and Pro168. The oxygen atom of the
carbothioate group, the linker connecting the first and second
groups, forms an H-bond with the side chain of Asn142 and two
indirect H-bonds with the side chain of Glu166 and the main
chain of Phe140 via water molecule W173. Compound 15
retains the important H-bonding interaction with Cys145, similar
to the initial hit, 2. However, the more intensive H-bond network
with Asn142, water, and Glu166 and the additional hydrophobic
interactions increase its potency to 3 μM.

Table 2. X-ray Data Collection and Structure Refinement

                                  native         3              15
resolution (Å)                    20-2.17        30-1.86        30-1.97
unit cell C2 (α = γ = 90°)
  a, Å                            108.195        107.776        108.279
  b, Å                            82.419         82.777         82.107
  c, Å                            53.609         53.579         53.407
  β, deg                          104.98         104.931        104.66
total reflections observed        199 770        361 184        1 269 862
unique reflections                24 317         35 887         29 584
multiplicity                      8.215          10.06          42.9
Rmerge, % (outer shell)           5.3 (39.7)     4.7 (51)       4.5 (43.3)
⟨I/σ(I)⟩ (outer shell)            12.7 (1.89)    23.9 (2.4)     41.3 (4.5)
completeness, % (outer shell)     97.7 (93.5)    98.6 (99.8)    99.1 (99.9)
Rwork, %                          20.9           20.4           21.4
Rfree, %                          24.8           23.3           24.0
RMS bonds, Å                      0.011          0.008          0.006
RMS angles, deg                   1.328          1.619          1.337
average B value
  protein                         35.181         30.421         40.156
  solvent                         41.496         41.465         48.303
  ligand                          -              62.606         66.079

Figure 5. Structure of SARS-CoV M pro in complex with compound 3
(pink). The binding of compound 3 to SARS-CoV M pro induces a
shift of the imidazole group of His41 (orange before inhibitor binding, gray
after inhibitor binding), resulting in the collapse of catalytic dyad
function. The H-bonding interactions are shown as dotted lines.

Figure 6. Structure of SARS-CoV M pro in complex with compound
15 (cyan). Compound 15 binds to the protein through hydrophobic
interactions with the surrounding residues and makes H-bonding
interactions with Cys145 and Asn142. In addition, it forms indirect
H-bonds to Glu166 and Phe140 through water molecules. Met165 and
Gln189 are shifted to accommodate the benzene group of compound
15 (orange before inhibitor binding, gray after inhibitor binding). The
H-bonding interactions are shown as dotted lines.

Compared to the native protein structure, the residues of
SARS-CoV M pro show no dramatic conformational change upon
binding of 15 except for Met165 and Gln189.
Gln189 moves close to the benzene ring of 15 and consequently
leads to a shift of its neighboring residue, Met165.

Comparison of 3 and 15 to Other Complex Structures of
SARS-CoV M pro. Several structures of SARS-CoV M pro in
complex with inhibitors have been published to date. 7,8,10,11
All of these inhibitors are peptide-like and bond covalently to
the protease. In contrast to these complex structures,
compounds 3 and 15 bind noncovalently to SARS-CoV M pro and
inhibit the protease in a distinct manner. To further explore the
difference, the structure of SARS-CoV M pro bound to APE
(azapeptide epoxide), a substrate-like inhibitor, was
superimposed on the complex structures of 3 and 15 individually
(Figure 7).

Figure 7. (A) Superimposition of the structures of 3 (pink) and APE (orange) in the binding pocket
of SARS-CoV M pro. (B) Superimposition of the structures of 15 (cyan) and APE (orange) in the
binding pocket of SARS-CoV M pro.

Compound 3 (Figure 7A) occupies a position similar to that of
the P3-P5 parts of APE. P3-P5 of APE form hydrophobic
interactions with Ala191, Pro168, and Met165 together with
H-bonding interactions with Glu166 and Gln189. These
interactions are also observed in the structure of compound 3,
except for the H-bonds to Glu166 and Gln189. The most
significant difference between these two structures is that the
conformation of His41 in the APE complex structure remains
unchanged, whereas His41 shifts away from Cys145 upon the
binding of compound 3.

Superimposition of the complex structure of compound 15
(Figure 7B) with APE reveals that compound 15 binds to the
P1, P2, P4, and P5 sites of APE. In this region, APE makes
hydrophobic interactions with Glu166, Met165, Cys145, His41,
His163, Pro168, and Gln192 and H-bonding interactions with
Glu166, His163, and Gly143. In addition, it forms a covalent
bond to Cys145. Although no covalent bond is formed
between compound 15 and Cys145, it does form one H-bond
with Cys145 and two H-bonds with Asn142. The residues
involved in hydrophobic interactions between compound 15 and
the protein are Gln189, Pro168, Glu166, Cys145, Gly143, Gln192,
Met165, His41, Ser144, and Asn142. These hydrophobic
interactions are similar to those of APE.

Conclusion 

In this study, novel nonpeptide inhibitors of SARS-CoV
M pro were discovered by structure-based drug design, a
combination of virtual screening, docking study, and analogue search.
This strategy successfully identified nonpeptide small
molecules with inhibition in the nanomolar range (IC 50 of 0.3 μM
for compound 3). To our
knowledge, compound 3 is the most potent inhibitor of SARS-CoV
M pro discovered by a computer-aided drug design
method without any chemical synthesis effort. Moreover,
the structural biology studies reveal that two potent inhibitors,
compounds 3 and 15, adopt distinct binding modes compared
to other published structures. The shift of His41 away from
Cys145 observed in the SARS-CoV M pro/3 structure results
in the complete loss of the catalytic dyad function of the
protease, providing an insight into the inhibition mechanism
against SARS-CoV M pro. Moreover, the structure of SARS-CoV
M pro in complex with compound 15 shows that the inhibitor
forms H-bonding interactions with Cys145 instead of the covalent
bonding seen in all published structures. Both binding modes
reveal novel inhibition mechanisms for SARS-CoV M pro and
could provide a rationale for the next generation of inhibitor
design.

Experimental Section 

Database Preparation. The Maybridge (Tintagel, Cornwall, U.K.)
2D compound database (58 855 compounds) in SDF format
was processed to remove salts and converted to 3D structures
by the Insight II program module DB_CONVERT. Protonation
states were assigned with the standard settings suggested by
DB_CONVERT. The Extended_Chains and Chair_Confs_Only
parameters were set to off, and the Rand_Chiral_Centers parameter
was set to 0.

Protein Preparation. The crystal structure of SARS-CoV
M pro in complex with CMK (PDB code 1UK4) was used. The
protonation states of residues were adjusted to the dominant ionic
forms at pH 7.5. The bound inhibitor and water molecules were
removed before the docking run.

Docking. Docking was performed with GOLD version 2.1
(CCDC Software Limited, Cambridge, U.K.). The default parameter
settings for library screening were used except that the early-termination
option was set to off. Residues within a radius of 10 Å around the
Sγ atom of Cys145 were defined as the active site for the docking
study. Twenty genetic algorithm (GA) runs were carried out for
each compound. For each GA run, the selection pressure was set
to 1.1, and 100 000 GA operations were performed on a set of five
islands with a population size of 100 individuals. The operator
weights for crossover, mutation, and migration were set to the
default values. Cutoff values of 2.5 Å for hydrogen bonds and 4.0
Å for van der Waals interactions were applied to allow a few bad bumps and
poor hydrogen bonds at the beginning of a GA run.
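The active-site definition (residues within 10 Å of the Cys145 Sγ atom) amounts to a simple distance selection, sketched below. This is not how GOLD implements it internally; the atom list format and the toy coordinates are hypothetical, and the sketch assumes a residue is selected if any one of its atoms lies within the radius.

```python
import math


def residues_within(atoms, center, radius=10.0):
    """Return sorted labels of residues with at least one atom within `radius`.

    atoms  : iterable of (residue_label, (x, y, z)) pairs, coordinates in Å
    center : (x, y, z) of the reference atom, e.g. the Sγ atom of Cys145
    """
    selected = set()
    for label, xyz in atoms:
        if math.dist(xyz, center) <= radius:
            selected.add(label)
    return sorted(selected)


if __name__ == "__main__":
    # Toy coordinates, invented for illustration; real input would come from
    # the ATOM records of the prepared protein structure.
    sg_cys145 = (0.0, 0.0, 0.0)
    demo_atoms = [
        ("Cys145", (0.0, 0.0, 0.0)),
        ("His41", (3.0, 0.0, 0.0)),
        ("Thr25", (9.9, 0.0, 0.0)),
        ("Ala1", (10.5, 0.0, 0.0)),  # outside the 10 Å sphere
        ("Ala1", (0.0, 12.0, 0.0)),  # second atom, also outside
    ]
    print(residues_within(demo_atoms, sg_cys145))
```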

SARS-CoV Main Protease Inhibition Assay. The SARS-CoV M pro
inhibition assay was performed by fluorescence resonance energy
transfer (FRET) based on the previously published procedure. 30,31 The
gene of SARS-CoV M pro was amplified from whole viral genomic
DNA by PCR and cloned into the Escherichia coli expression vector
pET32Xa/LIC. The recombinant protein was expressed in E. coli
BL21 with a 6x-His tag. The protein was purified on a Ni-NTA
agarose column (Qiagen, Valencia, CA) and cleaved by FXa
protease to remove the His tag. The purified SARS-CoV M pro has
the authentic sequence without extra amino acids, as confirmed by
N-terminal sequencing and mass spectrometry. The FRET assay was
performed at 25 °C in buffer containing 20 mM
bis[(2-hydroxyethyl)amino]tris(hydroxymethyl)methane (pH 7.0). The fluorogenic
substrate peptide (Dabcyl-KTSAVLQ-SGFRKME-Edans) emitted
fluorescence when cleaved by SARS-CoV M pro, and the enhanced
fluorescence was monitored at 538 nm with excitation at 355 nm
by use of a fluorescence plate reader. The IC 50 value of each
inhibitor was measured in a reaction mixture containing 50 nM
SARS-CoV M pro, 6 μM fluorogenic substrate, and various
concentrations of the inhibitor. The IC 50 value was obtained by plotting
the initial velocities of the inhibited reactions against the different
inhibitor concentrations by use of the following equation:

$$A_{[I]} = A_{[0]} \times \left\{ 1 - \frac{[I]}{[I] + \mathrm{IC}_{50}} \right\}$$

where $A_{[I]}$ is the enzyme activity at inhibitor concentration [I]
and $A_{[0]}$ is the enzyme activity without inhibitor.
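As a worked illustration of this equation, the sketch below evaluates the activity model, rearranges it to estimate IC 50 from a single measurement, and fits IC 50 by least squares over a grid of candidate values. The function names, the grid-search fit, and the synthetic data are assumptions made for illustration; the text does not state which fitting procedure or software was used.

```python
def activity(conc, a0, ic50):
    """Enzyme activity at inhibitor concentration `conc` (same units as ic50)."""
    return a0 * (1.0 - conc / (conc + ic50))


def ic50_from_point(conc, a_i, a0):
    """Single-point estimate obtained by rearranging the model:

    A/A0 = IC50 / ([I] + IC50)  =>  IC50 = [I] * A / (A0 - A)
    """
    if not 0.0 < a_i < a0:
        raise ValueError("activity must lie strictly between 0 and a0")
    return conc * a_i / (a0 - a_i)


def fit_ic50(concs, activities, a0, grid):
    """Return the candidate IC50 in `grid` with the smallest squared error."""
    def sse(candidate):
        return sum(
            (activity(c, a0, candidate) - a) ** 2
            for c, a in zip(concs, activities)
        )
    return min(grid, key=sse)


if __name__ == "__main__":
    # Synthetic, noise-free data generated from the model itself (not measured).
    concs = [0.1, 0.3, 1.0, 3.0]
    acts = [activity(c, 100.0, 0.3) for c in concs]
    grid = [0.1, 0.2, 0.3, 0.4, 0.5]
    print(fit_ic50(concs, acts, 100.0, grid))
```

At [I] = IC 50 the model gives exactly half of the uninhibited activity, which is a quick sanity check on any fitted value.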

Protein Purification, Crystallization, and Structure
Determination. SARS-CoV M pro was expressed in E. coli BL21 host
cells under the control of the T7 promoter. The recombinant protein
contained a 6x-His tag and was first purified on a Ni-NTA
column. The His fusion protein was then cleaved by FXa protease
to remove the tag, and the mixture was loaded onto a second
Ni-NTA column to obtain the pure protein. The
protease was >95% pure as checked by SDS-PAGE, and the
purified protein was subsequently concentrated for crystallization.

SARS-CoV M pro was crystallized in the absence and presence
of the inhibitors. Crystals were grown by mixing 1.5 μL of protein
solution [10.0 mg/mL in a buffer of 12 mM Tris-HCl (pH 7.5), 1
mM DTT, 120 mM NaCl, 0.1 mM EDTA, and 7.5 mM
β-mercaptoethanol] with 1.5 μL of well solution (6% PEG-6000, 2 mM DTT,
and 0.1 M Mes, pH 6.0). For compounds 3 and 15, protein solutions
were incubated with 2 mM compound for 2 h on ice in advance.
After 3-7 days at 18 °C, tetragonal crystals grew to an average
size of 0.2 mm. The crystals were soaked in a cryoprotectant
solution of mother liquor with 20% glycerol for 30 s before being
flash-frozen in liquid nitrogen.

Diffraction data were collected at two synchrotron radiation
centers. The native crystal diffraction data were collected at the NSRRC
BL17B beamline. The SARS-CoV M pro/3 and SARS-CoV M pro/15
diffraction data were collected at the SPring-8 SP12B2 and NSRRC
BL13B1 beamlines, respectively. All data were collected on an
ADSC Quantum 4R CCD detector at 100 K. All data sets were
scaled and integrated by HKL2000. 32 Molecular replacement was
performed by MOLREP 33 to solve the structures by use of the
monomer of the published SARS-CoV M pro structure (PDB code 1UK4,
A chain) as the search model. The structures were then refined by
REFMAC, 34 CNS, 35 and SHELX 36 together with several rounds of
manual model building in O. 37 All the figures were drawn with
PyMOL (DeLano Scientific LLC, San Francisco, CA). The
coordinates and structure factors have been deposited in the Protein
Data Bank with accession codes 2GZ9, 2GZ7, and 2GZ8 for the
SARS-CoV M pro native protein, SARS-CoV M pro/compound 3, and
SARS-CoV M pro/compound 15, respectively.